Sequence Listing



SEQ ID NO: 1 

the major 17 amino acid repeat, 

GluGlnGlnSerAspLeuGluGlnGluArgLeuAlaLysGluLysLeuGln (SEQ ID NO:1)



SEQ ID NO:2 
minor repeat 

GluGlnGlnArgAspLeuGluGlnGluArgLeuAlaLysGluLysLeuGln (SEQ ID NO:2) 
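
The major and minor repeats can be compared residue by residue. The following is a minimal sketch, not part of the original listing; the helper names `residues` and `differences` are hypothetical and only illustrate the comparison of SEQ ID NO:1 with SEQ ID NO:2.

```python
# Illustrative only: compare two repeats written in three-letter amino acid code.
MAJOR = "GluGlnGlnSerAspLeuGluGlnGluArgLeuAlaLysGluLysLeuGln"  # SEQ ID NO:1
MINOR = "GluGlnGlnArgAspLeuGluGlnGluArgLeuAlaLysGluLysLeuGln"  # SEQ ID NO:2


def residues(seq):
    """Split a three-letter-code string into a list of residues."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def differences(a, b):
    """Return (1-based position, residue in a, residue in b) for each mismatch."""
    pairs = zip(residues(a), residues(b))
    return [(pos, x, y) for pos, (x, y) in enumerate(pairs, start=1) if x != y]


if __name__ == "__main__":
    print(len(residues(MAJOR)), len(residues(MINOR)))
    print(differences(MAJOR, MINOR))
```

Under these assumptions, both repeats split into 17 residues and the comparison reports a single difference at position 4 (Ser in the major repeat, Arg in the minor repeat).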



SEQ ID NO:3 

DNA sequence of the gene lsa-nrc Hmut, Arg mutant

1 ATGGGTACCA ACAGCGAAAA AGACGAAATT ATCAAAAGCA ATCTCCGCTC CGGCAGCTCC
61 AACAGCCGCA ACCGCATCAA CGAGGAAAAG CATGAGAAGA AACATGTGCT GAGCCACAAC
121 TCCTACGAGA AGACTAAAAA CAACGAAAAC AACAAATTCT TTGACAAGGA CAAAGAGCTG
181 ACGATGAGCA ACGTTAAAAA CGTATCCCAG ACCAACTTTA AATCCCTCCT GCGCAACCTC
241 GGCGTTTCCG AGAACATCTT TCTCAAAGAA AACAAACTGA ACAAGGAAGG CAAACTGATT
301 GAACATATCA TCAACGACGA CGATGACAAA AAAAAATACA TTAAAGGCCA GGATGAAAAT
361 CGCCAGGAAG ACCTCGAAGA AAAAGCTGCT GAACAGCAGT CGGACCTGGA ACAGGAGCGC
421 CTCGCTAAAG AAAAGCTCCA GGAGCGCCTC GCTAAAGAAA AGCTCCAGGA GCAACAGCGC
481 GACCTGGAAC AGCGCAAGGC TGACACGAAA AAAAACCTGG AACGCAAAAA GGAACACGGC
541 GACGTTCTGG CTGAGGACCT GTACGGCCGC CTGGAAATCC CAGCTATCGA ACTCCCATCC
601 GAAAACGAAC GCGGCTACTA CATCCCACAC CAGAGCAGCC TGCCACAAGA TAATCGCGGG
661 AACTCCCGCG ACAGTAAGGA AATCAGCATC ATCGAAAAAA CCAACCGCGA AAGCATTACC
721 ACCAACGTGG AAGGCCGCCG CGACATCCAC AAAGGCCACC TCGAAGAAAA GAAAGACGGC
781 TCCATCAAAC CAGAACAGAA AGAAGACAAA AGCGCTGATA TCCAGAACCA CACCCTGGAG
841 ACCGTGAACA TTAGCGACGT GAACGACTTC CAGATCAGCA AGTACGAGGA CGAAATCTCC
901 GCTGAATACG ATGACTCCCT GATCGACGAA GAAGAAGACG ACGAAGATCT GGATGAATTC
961 AAACCAATTG TCCAGTACGA TAACTTTCAG GACGAAGAAA ATATCGGCAT TTACAAAGAA
1021 CTCGAAGACC TCATCGAGAA AAACGAAAAC CTGGACGACC TGGACGAAGG CATCGAAAAA
1081 TCCTCCGAAG AACTGAGCGA AGAAAAAATC AAAAAAGGCA AGAAATACGA AAAAACCAAG
1141 GACAACAACT TCAAACCAAA CGACAAATCC CTCTACGACG AGCACATTAA AAAATACAAA
1201 AACGACAAGC AAGTGAACAA GGAAAAGGAA AAATTTATCA AATCCCTCTT CCACATCTTC
1261 GATGGCGATA ACGAAATTCT GCAAATTGTA GACGAACGGT TGAGCGAAGA CATCACTAAA
1321 TACTTCATGA AGCTTGGGGG CTCCGGTTCT CCACACCACC ACCACCACCA C TGA

SEQ ID NO:4

LSA-NRC (H) Mut Protein 

MetGlyThrAsnSerGluLysAspGluIleIleLysSerAsnLeuArgSerGlySerSerAsnSerArgAsnArg 25
IleAsnGluGluLysHisGluLysLysHisValLeuSerHisAsnSerTyrGluLysThrLysAsnAsnGluAsn 50
AsnLysPhePheAspLysAspLysGluLeuThrMetSerAsnValLysAsnValSerGlnThrAsnPheLysSer 75
LeuLeuArgAsnLeuGlyValSerGluAsnIlePheLeuLysGluAsnLysLeuAsnLysGluGlyLysLeuIle 100
GluHisIleIleAsnAspAspAspAspLysLysLysTyrIleLysGlyGlnAspGluAsnArgGlnGluAspLeu 125
GluGluLysAlaAlaGluGlnGlnSerAspLeuGluGlnGluArgLeuAlaLysGluLysLeuGlnGluArgLeu 150
AlaLysGluLysLeuGlnGluGlnGlnArgAspLeuGluGlnArgLysAlaAspThrLysLysAsnLeuGluArg 175
LysLysGluHisGlyAspValLeuAlaGluAspLeuTyrGlyArgLeuGluIleProAlaIleGluLeuProSer 200
GluAsnGluArgGlyTyrTyrIleProHisGlnSerSerLeuProGlnAspAsnArgGlyAsnSerArgAspSer 225
LysGluIleSerIleIleGluLysThrAsnArgGluSerIleThrThrAsnValGluGlyArgArgAspIleHis 250
LysGlyHisLeuGluGluLysLysAspGlySerIleLysProGluGlnLysGluAspLysSerAlaAspIleGln 275
AsnHisThrLeuGluThrValAsnIleSerAspValAsnAspPheGlnIleSerLysTyrGluAspGluIleSer 300
AlaGluTyrAspAspSerLeuIleAspGluGluGluAspAspGluAspLeuAspGluPheLysProIleValGln 325
TyrAspAsnPheGlnAspGluGluAsnIleGlyIleTyrLysGluLeuGluAspLeuIleGluLysAsnGluAsn 350
LeuAspAspLeuAspGluGlyIleGluLysSerSerGluGluLeuSerGluGluLysIleLysLysGlyLysLys 375
TyrGluLysThrLysAspAsnAsnPheLysProAsnAspLysSerLeuTyrAspGluHisIleLysLysTyrLys 400
AsnAspLysGlnValAsnLysGluLysGluLysPheIleLysSerLeuPheHisIlePheAspGlyAspAsnGlu 425
IleLeuGlnIleValAspGluArgLeuSerGluAspIleThrLysTyrPheMetLysLeuGlyGlySerGlySerPro 450
HisHisHisHisHisHis 456



SEQ ID NO:5

conserved amino acids of the same basic 17 amino acids following the order: 
X1GlnGlnX2AspX3... (SEQ ID NO:5), where X1 is either Glu or Gly; X2 is Ser or Arg; X3 is Asp or Ser; X4 is Glu or Asp; X5 is Leu or Arg; X6 is Lys or Asn; and X7 is Lys or Thr or Arg.

SEQ ID NO:6 

LeuThrMetSerAsnValLysAsnValSerGlnThrAsnPheLysSerLeuLeuArgAsnLeuGlyValSer 
(SEQ ID NO:6)

SEQ ID NO:7 

GluGlnGlnSerAspLeuGluGlnGluArgLeuAlaLysGluLysLeuGln (SEQ ID NO:7) 
SEQ ID NO: 8 

GluArgLeuAlaLysGluLysLeuGlnGluGlnGlnArgAspLeuGluGln (SEQ ID NO:8) 
SEQ ID NO:9 

ThrLysLysAsnLeuGluArgLysLysGluHisGlyAspValLeuAlaGluAspLeuTyr (SEQ ID NO:9) 
SEQ ID NO: 10 

AsnSerArgAspSerLysGluIleSerIleIleGluLysThrAsnArgGluSerIleThrThrAsnValGlu
GlyArgArgAspIleHisLysGlyHisLeu (SEQ ID NO: 10) 

SEQ ID NO: 11 

LysProIleValGlnTyrAspAsnPhe (SEQ ID NO:11)
SEQ ID NO: 12 

AsnGluAsnLeuAspAspLeuAspGluGlyIleGluLysSerSerGluGluLeuSerGluGluLys
Ile (SEQ ID NO: 12)

SEQ ID NO: 13 

LysProAsnAspLysSerLeu (SEQ ID NO: 13) 
SEQ ID NO: 14 

AspAsnGluIleLeuGlnIleValAspGluLeuSerGluAspIleThrLysTyrPheMetLysLeu
(SEQ ID NO: 14) 

SEQ ID NO: 15 

AspAsnGluIleLeuGlnIleValAspGluArgLeuSerGluAspIleThrLysTyrPheMetLysLeu
(SEQ ID NO: 15) 
SEQ ID NO:16

LeuThrMetSerAsnValLysAsnValSerGlnThrAsnPheLysSerLeuLeuArgAsnLeuGlyValSer
(SEQ ID NO:16)

SEQ ID NO:17

HisThrLeuGluThrValAsnIleSerAspValAsnAspPheGlnIleSerLysTyrGlu (SEQ ID NO: 17)
SEQ ID NO:18

AspGluAspLeuAspGluPheLysProIleValGlnTyrAspAsnPheGlnAsp (SEQ ID NO: 18)
SEQ ID NO: 19

IleGlyIleTyrLysGluLeuGluAspLeuIleGluLys (SEQ ID NO: 19)
SEQ ID NO:20

AsnGluAsnLeuAspAspLeuAspGluGlyIleGluLysSerSerGluGluLeuSerGluGluLysIle
(SEQ ID NO:20)

SEQ ID NO:21

IleLysLysGlyLysLysTyrGluLysThrLysAspAsnAsnPhe (SEQ ID NO:21)
SEQ ID NO:22

AspAsnGluIleLeuGlnIleValAspGluLeuSerGluAspIleThrLysTyrPheMetLysLeu (SEQ ID NO:22)
SEQ ID NO:23

TyrTyrIleProHisGlnSerSerLeu (SEQ ID NO:23)
SEQ ID NO:24 

Amino acid sequence of LSA-NRC(H) repeat sequence between N & C terminals. 
GluGlnGlnSerAspLeuGluGlnGluArgLeuAlaLysGluLysLeuGln 

GluArgLeuAlaLysGluLysLeuGlnGluGlnGlnArgAspLeuGluGln (SEQ ID NO:24). 
SEQ ID NO:25 

DNA sequence of the gene lsa-nrc H

1 ATGGGTACCA ACAGCGAAAA AGACGAAATT ATCAAAAGCA ATCTCCGCTC CGGCAGCTCC
61 AACAGCCGCA ACCGCATCAA CGAGGAAAAG CATGAGAAGA AACATGTGCT GAGCCACAAC
121 TCCTACGAGA AGACTAAAAA CAACGAAAAC AACAAATTCT TTGACAAGGA CAAAGAGCTG
181 ACGATGAGCA ACGTTAAAAA CGTATCCCAG ACCAACTTTA AATCCCTCCT GCGCAACCTC
241 GGCGTTTCCG AGAACATCTT TCTCAAAGAA AACAAACTGA ACAAGGAAGG CAAACTGATT
301 GAACATATCA TCAACGACGA CGATGACAAA AAAAAATACA TTAAAGGCCA GGATGAAAAT
361 CGCCAGGAAG ACCTCGAAGA AAAAGCTGCT GAACAGCAGT CGGACCTGGA ACAGGAGCGC
421 CTCGCTAAAG AAAAGCTCCA GGAGCGCCTC GCTAAAGAAA AGCTCCAGGA GCAACAGCGC
481 GACCTGGAAC AGCGCAAGGC TGACACGAAA AAAAACCTGG AACGCAAAAA GGAACACGGC
541 GACGTTCTGG CTGAGGACCT GTACGGCCGC CTGGAAATCC CAGCTATCGA ACTCCCATCC
601 GAAAACGAAC GCGGCTACTA CATCCCACAC CAGAGCAGCC TGCCACAAGA TAATCGCGGG
661 AACTCCCGCG ACAGTAAGGA AATCAGCATC ATCGAAAAAA CCAACCGCGA AAGCATTACC
721 ACCAACGTGG AAGGCCGCCG CGACATCCAC AAAGGCCACC TCGAAGAAAA GAAAGACGGC
781 TCCATCAAAC CAGAACAGAA AGAAGACAAA AGCGCTGATA TCCAGAACCA CACCCTGGAG
841 ACCGTGAACA TTAGCGACGT GAACGACTTC CAGATCAGCA AGTACGAGGA CGAAATCTCC
901 GCTGAATACG ATGACTCCCT GATCGACGAA GAAGAAGACG ACGAAGATCT GGATGAATTC
961 AAACCAATTG TCCAGTACGA TAACTTTCAG GACGAAGAAA ATATCGGCAT TTACAAAGAA
1021 CTCGAAGACC TCATCGAGAA AAACGAAAAC CTGGACGACC TGGACGAAGG CATCGAAAAA
1081 TCCTCCGAAG AACTGAGCGA AGAAAAAATC AAAAAAGGCA AGAAATACGA AAAAACCAAG
1141 GACAACAACT TCAAACCAAA CGACAAATCC CTCTACGACG AGCACATTAA AAAATACAAA
1201 AACGACAAGC AAGTGAACAA GGAAAAGGAA AAATTTATCA AATCCCTCTT CCACATCTTC
1261 GATGGCGATA ACGAAATTCT GCAAATTGTA GACGAACTGA GCGAAGACAT CACTAAATAC
1321 TTCATGAAGC TTGGGGGCTC CGGTTCTCCA CACCACCACC ACCACCACTG A

SEQ ID NO:26

LSA-NRC Protein

MetGlyThrAsnSerGluLysAspGluIleIleLysSerAsnLeuArgSerGlySerSerAsnSerArgAsnArg 25
IleAsnGluGluLysHisGluLysLysHisValLeuSerHisAsnSerTyrGluLysThrLysAsnAsnGluAsn 50
AsnLysPhePheAspLysAspLysGluLeuThrMetSerAsnValLysAsnValSerGlnThrAsnPheLysSer 75
LeuLeuArgAsnLeuGlyValSerGluAsnIlePheLeuLysGluAsnLysLeuAsnLysGluGlyLysLeuIle 100
GluHisIleIleAsnAspAspAspAspLysLysLysTyrIleLysGlyGlnAspGluAsnArgGlnGluAspLeu 125
GluGluLysAlaAlaGluGlnGlnSerAspLeuGluGlnGluArgLeuAlaLysGluLysLeuGlnGluArgLeu 150
AlaLysGluLysLeuGlnGluGlnGlnArgAspLeuGluGlnArgLysAlaAspThrLysLysAsnLeuGluArg 175
LysLysGluHisGlyAspValLeuAlaGluAspLeuTyrGlyArgLeuGluIleProAlaIleGluLeuProSer 200
GluAsnGluArgGlyTyrTyrIleProHisGlnSerSerLeuProGlnAspAsnArgGlyAsnSerArgAspSer 225
LysGluIleSerIleIleGluLysThrAsnArgGluSerIleThrThrAsnValGluGlyArgArgAspIleHis 250
LysGlyHisLeuGluGluLysLysAspGlySerIleLysProGluGlnLysGluAspLysSerAlaAspIleGln 275
AsnHisThrLeuGluThrValAsnIleSerAspValAsnAspPheGlnIleSerLysTyrGluAspGluIleSer 300
AlaGluTyrAspAspSerLeuIleAspGluGluGluAspAspGluAspLeuAspGluPheLysProIleValGln 325
TyrAspAsnPheGlnAspGluGluAsnIleGlyIleTyrLysGluLeuGluAspLeuIleGluLysAsnGluAsn 350
LeuAspAspLeuAspGluGlyIleGluLysSerSerGluGluLeuSerGluGluLysIleLysLysGlyLysLys 375
TyrGluLysThrLysAspAsnAsnPheLysProAsnAspLysSerLeuTyrAspGluHisIleLysLysTyrLys 400
AsnAspLysGlnValAsnLysGluLysGluLysPheIleLysSerLeuPheHisIlePheAspGlyAspAsnGlu 425
IleLeuGlnIleValAspGluLeuSerGluAspIleThrLysTyrPheMetLysLeuGlyGlySerGlySerPro 450
HisHisHisHisHisHis 456
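
The relationship between the DNA listing (SEQ ID NO:25) and the protein listing (SEQ ID NO:26) can be spot-checked by translating codons. The sketch below is illustrative only and is not part of the original listing. The names `CODON_TO_AA` and `translate` are hypothetical, and the table holds only the standard-genetic-code entries needed for the first 60 bases, not a full codon table.

```python
# Illustrative only: translate the first 60 bases of SEQ ID NO:25 and compare
# the result with the first 20 residues of SEQ ID NO:26.

# Partial standard genetic code: only the codons that occur in the first row.
CODON_TO_AA = {
    "ATG": "Met", "GGT": "Gly", "ACC": "Thr", "AAC": "Asn", "AGC": "Ser",
    "GAA": "Glu", "AAA": "Lys", "GAC": "Asp", "ATT": "Ile", "ATC": "Ile",
    "AAT": "Asn", "CTC": "Leu", "CGC": "Arg", "TCC": "Ser", "GGC": "Gly",
}

FIRST_ROW = "ATGGGTACCA ACAGCGAAAA AGACGAAATT ATCAAAAGCA ATCTCCGCTC CGGCAGCTCC"
FIRST_20_RESIDUES = "MetGlyThrAsnSerGluLysAspGluIleIleLysSerAsnLeuArgSerGlySerSer"


def translate(dna):
    """Translate a DNA string (spaces allowed) into three-letter residue codes."""
    bases = "".join(dna.split()).upper()
    usable = len(bases) - len(bases) % 3  # ignore any trailing partial codon
    codons = [bases[i:i + 3] for i in range(0, usable, 3)]
    return "".join(CODON_TO_AA[codon] for codon in codons)


if __name__ == "__main__":
    print(translate(FIRST_ROW) == FIRST_20_RESIDUES)
```

Extending the check to the whole sequence would require a complete codon table. A codon missing from the partial table raises a `KeyError` rather than being skipped silently.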

SEQ ID NO: 27 

VSQTNFKSL (SEQ ID NO: 27) 

SEQ ID NO: 28 

SQTNFKSL (SEQ ID NO: 28), 